Today we will recreate the following graph from the article "Emissions Are Surging Back as Countries and States Reopen":

[Figure: CO2 emissions chart from the article]

I downloaded the dataset as an Excel file and saved the data for each region as a separate CSV file.

import altair as alt
import pandas as pd
ind = pd.read_csv('ind_co2_em.csv')
ind = ind.iloc[1:]

chn = pd.read_csv('china_co2_em.csv', sep=';')
chn = chn.iloc[1:]

us = pd.read_csv('us_co2_em.csv', sep=';')
us = us.iloc[1:]

euuk = pd.read_csv('euuk_co2_em.csv', sep=';')
euuk = euuk.iloc[1:]

globl = pd.read_csv('global_co2_em.csv', sep=';')
globl = globl.iloc[1:]

data = pd.concat([chn, ind, euuk, us, globl])

data['DATE'] = pd.to_datetime(data['DATE'], format='%d/%m/%Y')
# Convert the emission columns, including the total plotted below, from strings to numbers
num_cols = ['TOTAL_CO2_MED', 'PWR_CO2_MED', 'IND_CO2_MED', 'TRS_CO2_MED', 'PUB_CO2_MED', 'RES_CO2_MED', 'AVI_CO2_MED']
data[num_cols] = data[num_cols].apply(pd.to_numeric)
#hide_output
data.head()
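
The five load steps above repeat the same pattern, so they could be folded into one helper. This is a minimal sketch, not part of the original notebook: `load_region` is a hypothetical name, and the idea that the dropped first row holds units or descriptions is an assumption inferred from the `.iloc[1:]` calls. The sample text is made up so the sketch runs without the real files.

```python
import io
import pandas as pd

def load_region(source, sep=';'):
    """Read one regional CSV and drop its first data row.

    Assumption: the first row after the header is a units/description row,
    which is why the steps above call .iloc[1:].
    """
    df = pd.read_csv(source, sep=sep)
    return df.iloc[1:]

# Made-up sample standing in for one of the CSV files
sample = io.StringIO(
    "REGION_CODE;DATE;TOTAL_CO2_MED\n"
    "-;-;units\n"
    "CHN;01/01/2020;10.5\n"
    "CHN;02/01/2020;20.5\n"
)
demo = load_region(sample)
demo['DATE'] = pd.to_datetime(demo['DATE'], format='%d/%m/%Y')
demo['TOTAL_CO2_MED'] = pd.to_numeric(demo['TOTAL_CO2_MED'])
print(demo)
```

With the real files, a call such as `load_region('china_co2_em.csv')` would replace the matching `read_csv` and `iloc` pair, passing a different `sep` for any file saved with another delimiter.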

Look closely at the chart and you will see that it is a stacked area chart, so that is where we start.

alt.Chart(data).mark_area().encode(
     x=alt.X('DATE:T'),
     y=alt.Y('TOTAL_CO2_MED:Q'),
     color=alt.Color('REGION_NAME:N'),
).properties(width=800, height=400)

This is close, but not exactly what we saw in the article: the regions are stacked in a different order. We can match the article's order with the order encoding channel.

alt.Chart(data).mark_area().transform_calculate(order="{'CHN': 0, 'IND': 1, 'EUandUK': 2, 'USA': 3, 'GLOBAL': 4}[datum.REGION_CODE]").encode(
     x=alt.X('DATE:T'),
     y=alt.Y('TOTAL_CO2_MED:Q'),
     color=alt.Color('REGION_CODE:N'),
     order='order:O'
).properties(width=800, height=400)
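
The stacking order could also be precomputed in pandas instead of inside a Vega expression. The sketch below is an alternative, not the approach used in this post; `REGION_ORDER` and `add_stack_order` are hypothetical names, and the toy frame only carries the `REGION_CODE` column.

```python
import pandas as pd

# Same code-to-position lookup as in the transform_calculate expression above
REGION_ORDER = {'CHN': 0, 'IND': 1, 'EUandUK': 2, 'USA': 3, 'GLOBAL': 4}

def add_stack_order(df, mapping=REGION_ORDER):
    """Return a copy of df with an integer 'order' column derived from REGION_CODE."""
    out = df.copy()
    out['order'] = out['REGION_CODE'].map(mapping)
    return out

# Toy frame with one row per region, in arbitrary order
toy = pd.DataFrame({'REGION_CODE': ['USA', 'CHN', 'GLOBAL', 'IND', 'EUandUK']})
ordered = add_stack_order(toy)
print(ordered)
```

With an `order` column already in the data, the chart would keep `order='order:O'` and drop the `transform_calculate` step. Keeping the lookup in the chart, as the post does, leaves the DataFrame untouched.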

This now matches the article's stacking order. Next, the colors. Left to myself, I would probably have picked a gray scale like this:

alt.Chart(data).mark_area().transform_calculate(order="{'CHN': 0, 'IND': 1, 'EUandUK': 2, 'USA': 3, 'GLOBAL': 4}[datum.REGION_CODE]").encode(
     x=alt.X('DATE:T'),
     y=alt.Y('TOTAL_CO2_MED:Q'),
     color=alt.Color('REGION_CODE:N',scale=alt.Scale(domain=['CHN', 'IND', 'EUandUK', 'USA', 'GLOBAL'], range=["#c9c9c9", "#aaaaaa", "#888888", "#686868", "#454545"])),
     order='order:O'
).properties(width=800, height=400)

To match the graph in the article, we will instead sample its colors using https://imagecolorpicker.com/en/

alt.Chart(data).mark_area().transform_calculate(order="{'CHN': 0, 'IND': 1, 'EUandUK': 2, 'USA': 3, 'GLOBAL': 4}[datum.REGION_CODE]").encode(
     x=alt.X('DATE:T'),
     y=alt.Y('TOTAL_CO2_MED:Q'),
     color=alt.Color('REGION_CODE:N',scale=alt.Scale(domain=['CHN', 'IND', 'EUandUK', 'USA', 'GLOBAL'], range=["#fde9d1", "#fcd08b", "#f9b382", "#e38875", "#ac7066"])),
     order='order:O'
).properties(width=800, height=400)

The chart now captures the trend, but the area for the rest of the world (the GLOBAL series) is much larger than it should be.
That is because the global series already includes the emissions of the US, the EU and UK, India, and China, so those regions are counted twice. We need to subtract their contributions from the global data and then stack the remainder.
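
The subtraction can be illustrated on toy numbers first. This is a minimal sketch with made-up values and only two named regions; it aligns the frames on `DATE`, which is a choice made for this example.

```python
import pandas as pd

dates = pd.to_datetime(['01/01/2020', '02/01/2020'], format='%d/%m/%Y')

# Made-up numbers purely for illustration
regions = {
    'CHN': pd.DataFrame({'DATE': dates, 'TOTAL_CO2_MED': [3.0, 4.0]}),
    'USA': pd.DataFrame({'DATE': dates, 'TOTAL_CO2_MED': [2.0, 3.0]}),
}
world = pd.DataFrame({'DATE': dates, 'TOTAL_CO2_MED': [10.0, 12.0]})

# Index every frame by DATE so the arithmetic aligns on dates, not row position
named_sum = sum(df.set_index('DATE')['TOTAL_CO2_MED'] for df in regions.values())

# Whatever is left after removing the named regions is the rest of the world
rest_toy = (world.set_index('DATE')['TOTAL_CO2_MED'] - named_sum).reset_index()
print(rest_toy)
```

The steps that follow apply the same idea to every measurement column at once and rely on all five frames sharing the same row index.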

# Parse the dates and convert every measurement column (position 5 onward) to numbers
for df in (chn, ind, us, euuk, globl):
    df['DATE'] = pd.to_datetime(df['DATE'], format='%d/%m/%Y')
    df[list(df.columns)[5:]] = df[list(df.columns)[5:]].apply(pd.to_numeric)

# Add up the four named regions, then subtract that sum from the global figures
countries_sum = ind[list(ind.columns)[5:]] + chn[list(chn.columns)[5:]] + us[list(us.columns)[5:]] + euuk[list(euuk.columns)[5:]]
rest = globl[list(globl.columns)[5:]] - countries_sum

# Label the remainder as its own region so it can be stacked with the others
rest['REGION_ID'] = 99
rest['REGION_CODE'] = 'RST'
rest['REGION_NAME'] = 'REST'
rest['TIME_POINT'] = globl['TIME_POINT']
rest['DATE'] = globl['DATE']
data = pd.concat([chn, ind, euuk, us, rest])

# Stacked chart with the remainder in place of the global series
alt.Chart(data).mark_area().transform_calculate(order="{'CHN': 0, 'IND': 1, 'EUandUK': 2, 'USA': 3, 'RST': 4}[datum.REGION_CODE]").encode(
     x=alt.X('DATE:T', axis=alt.Axis(format='%B')),
     y=alt.Y('TOTAL_CO2_MED:Q'),
     color=alt.Color('REGION_CODE:N',scale=alt.Scale(domain=['CHN', 'IND', 'EUandUK', 'USA', 'RST'], range=["#fde9d1", "#fcd08b", "#f9b382", "#e38875", "#ac7066"])),
     order='order:O'
).properties(width=800, height=400).configure_view(strokeWidth=0).configure_axis(grid=False)

# Same chart kept as a layer so that text labels can be added on top
base = alt.Chart(data).mark_area().transform_calculate(order="{'CHN': 0, 'IND': 1, 'EUandUK': 2, 'USA': 3, 'RST': 4}[datum.REGION_CODE]").encode(
     x=alt.X('DATE:T', axis=alt.Axis(format='%B')),
     y=alt.Y('TOTAL_CO2_MED:Q'),
     color=alt.Color('REGION_CODE:N',scale=alt.Scale(domain=['CHN', 'IND', 'EUandUK', 'USA', 'RST'], range=["#fde9d1", "#fcd08b", "#f9b382", "#e38875", "#ac7066"])),
     order='order:O'
).properties(width=800, height=400)

# One label per region, placed at the median date and the region's minimum value
t = alt.Chart(data).mark_text().encode(
    x=alt.X('DATE:T', aggregate='median'),
    y='min(TOTAL_CO2_MED):Q',
    text=alt.Text('REGION_NAME:N'),
)

(base+t).configure_view(strokeWidth=0).configure_axis(grid=False)

While we are at it, we can also recreate the following graph of global emissions by sector:

[Figure: global emissions by sector]

line = alt.Chart(globl).mark_line().encode(
    x='DATE:T',
    y=alt.Y('TRS_CO2_MED:Q'),
)
band = line.mark_area(opacity=0.3).encode(
    x='DATE:T',
    y=alt.Y('TRS_CO2_LOW:Q'),
    y2=alt.Y2('TRS_CO2_HIGH:Q'),
)
line + band

l = alt.Chart(globl).mark_line().transform_fold(
    ['TRS_CO2_MED', 'IND_CO2_MED', 'PWR_CO2_MED', 'PUB_CO2_MED', 'AVI_CO2_MED', 'RES_CO2_MED']
).encode(
    x='DATE:T',
    y='value:Q',
).facet(
    'key:N', columns=3
)

l
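
The `transform_fold` step above reshapes the sector columns from wide to long form, producing `key` and `value` fields. The pandas counterpart is `melt`, sketched here on made-up numbers with only two sector columns so the reshaping is easy to see.

```python
import pandas as pd

# Made-up wide table: one row per date, one column per sector
wide = pd.DataFrame({
    'DATE': pd.to_datetime(['01/01/2020', '02/01/2020'], format='%d/%m/%Y'),
    'TRS_CO2_MED': [1.0, 2.0],
    'PWR_CO2_MED': [3.0, 4.0],
})

# Long form: one row per (date, sector) pair, named like the fold output
long_form = wide.melt(
    id_vars='DATE',
    value_vars=['TRS_CO2_MED', 'PWR_CO2_MED'],
    var_name='key',
    value_name='value',
)
print(long_form)
```

Folding inside the chart keeps `globl` in its original wide shape. Melting in pandas first would be the choice if the long table is also needed elsewhere.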